Introduction: PASS

The CDO Data Science team developed a geographic information system, Potential Accessibility Software Service (PASS). PASS offers an advanced quantitative approach to measure how spatially accessible population demand is to a given service.

Spatial accessibility is the consideration of how physical and social space and place affect how a population can traverse through it to access a given service. Though abstract in nature, it can be measured through considerations like where potential population demand is located, the geographic distance to get from the population location to the service locations offered, the supply at the service locations, as well as the probability of a population going to one service location over another based on the capacity. PASS uses the enhanced 3-step floating catchment methodology to accomplish this, which is further explained below.

PASS lets you select a geographic area of interest by panning and zooming on the interactive map, and lets you define the parameters to model spatial accessibility to better reflect Canada’s diverse society. For example, individuals living in urban areas versus rural areas have different assumptions and considerations for how to access a service.

Spatial Accessibility

To understand how a phenomenon traverses through space, spatial interaction methodologies model spatial and non-spatial physical and social observations. For example, road networks and driving observations are leveraged to predict traffic flows; while demographic and social media data might be used to understand information flows. Spatial interaction theory can also be used for quantifying access through measuring spatial accessibility, that is the availability and accessibility of a given service, often presented as an index (Ma et al. 2018).

Access is a multidimensional construct that depends on both objective (e.g., financial, social and geographical) and subjective (e.g., local knowledge) determinants. Moreover, there are two perspectives of access, potential and revealed, which can be measured by assessing non-spatial and spatial barriers. Revealed access relies on data actually collected from a service, for example to help a provider better locate services for its known customers (Bauer and Groneberg 2016). Potential access, on the other hand, relies on population data (e.g., Census) representing those who likely need or want access to a service, for example to ensure that a service covers its clients of interest (Joseph and Bantock 1982). Whether potential or revealed, barriers to access include the following: availability, accessibility, accommodation, affordability and acceptability. Spatial accessibility is commonly described as the availability and accessibility of a given service (Bauer and Groneberg 2016).

The use of population distributions as a proxy for demand, instead of actual use, can result in inaccurate estimates of demand (as not everyone makes use of the service, or uses it at the same frequency). However, since utilization data is rarely available and quality assured, the literature on calculating spatial accessibility often focuses on population data, resulting in estimates of potential accessibility. Moreover, in the context of government services, though it is important to assess whether a given service is accessible to its consumers, it is arguably more important to ensure that a government service is accessible to those who should and need to be receiving it.

Literature Review: Current Methods

Public Health research has led the development and application of methods for quantifying potential spatial accessibility, identifying population areas (i.e., Census geographic boundaries like Census Tracts) that are underserviced by health practitioners. This research, however, has expanded into other socioeconomic applications, such as businesses determining where they should develop a new service location.

Regional Availability Model (Provider-to-Population Ratios)

Though quantifying access is a complex problem, there is a simpler mechanism for measuring potential spatial accessibility through using the regional availability model. This model calculates the ratio of total supply provided by a given service to the total population in a given geographic unit, also known as provider-to-population ratios (PPRs) (Bauer and Groneberg 2016; Paez et al. 2019). The geographic unit for calculating these ratios is usually a Census boundary, like the Census Subdivision. Though computationally simple to calculate and potentially informative for aggregated analyses (hence regional), this model assumes the population served by a site is completely contained within the given area and that all people within the area have equal access to the service. This does not reflect reality, however, as individuals move across space non-uniformly, especially when utilizing granular geographic areas like Census Tracts (ibid.; Ma et al. 2018). Furthermore, the modifiable areal unit problem (MAUP) occurs, which is when a modification of the areal unit (e.g., Census Tracts to Postal Code areas) yields different results as geographic units are designed for their intended uses (e.g., Statistics Canada Census and Canada Post).
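The provider-to-population ratio described above can be sketched in a few lines of Python. This is a minimal illustration, not the report's implementation: the function name, the site and unit identifiers, and all numbers are hypothetical.

```python
from collections import defaultdict

def provider_to_population_ratios(supply_by_site, unit_of_site, population_by_unit):
    """Return total supply / total population for every populated geographic unit."""
    supply_by_unit = defaultdict(float)
    for site, supply in supply_by_site.items():
        # Each site only counts toward the unit that contains it.
        supply_by_unit[unit_of_site[site]] += supply
    return {
        unit: supply_by_unit[unit] / population
        for unit, population in population_by_unit.items()
        if population > 0
    }

# Hypothetical data: three service sites spread over two of three Census units.
supply_by_site = {"site_a": 2.0, "site_b": 1.0, "site_c": 3.0}
unit_of_site = {"site_a": "unit_1", "site_b": "unit_1", "site_c": "unit_2"}
population_by_unit = {"unit_1": 6000, "unit_2": 12000, "unit_3": 4000}

ratios = provider_to_population_ratios(supply_by_site, unit_of_site, population_by_unit)
for unit, ratio in sorted(ratios.items()):
    print(unit, round(ratio, 6))
```

In this made-up example `unit_3` receives a ratio of zero even if a site sits just across its boundary, which is the containment assumption criticized above.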

Gravity Model

To avoid confining the analysis within geographic units, potential spatial accessibility has been quantified through modifications of Newton’s model of gravity and the use of floating catchment areas to account for how distance influences demand (Hansen 1959; Joseph and Bantock 1982; Wang and Luo 2003). The gravity model assumes a given population’s accessibility to a service decreases as travel distance to that service increases; moreover, for a given distance, accessibility will rise as the magnitude of service available at the site increases. A key advantage of gravity-based approaches over the simpler regional availability models is that distance decay is considered, meaning your accessibility to a service decreases the further you are from it (which should result in more realistic estimates). The equation below details the basic gravity-based spatial accessibility model (Wan et al. 2012; Luo et al. 2014):

\[ A_i = \sum_{j=1}^{n} \frac{S_j f(d_{i,j})}{\sum_{k=1}^{m} P_k f(d_{k,j})} \quad (1) \]

\(A_i\) = spatial accessibility at population demand location \(i\)

\(m\) = total number of demand locations

\(n\) = total number of service locations

\(S_j\) = supply at a given service location \(j\) (e.g., Service Canada)

\(P_k\) = population count at a given demand area location \(k\)

\(d_{i,j}\) = travel distance from location \(i\) to \(j\)

\(f(d)\) = the generalized distance decay function

The gravity model is essentially the PPR, but with a generalized distance decay function \(f(d)\) (also known as the distance impedance function) that determines how distance influences accessibility. The distance decay function has three common forms: the inverse-power function \(d^{-β}\), the exponential function \(e^{-βd}\) and the Gaussian function \(e^{-d^2/β}\) (Kwan 1998). Each of these functions takes, as input, the distance between two objects. The distance calculation can be as simple as a Euclidean (straight-line) distance from population location \(i\) to service location \(j\), or it can be more complex, considering physical barriers (e.g., rivers, elevation) and/or different transportation networks (e.g., road, public transit).
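As a rough illustration of equation 1 and the three decay forms, the sketch below uses plain Python lists. The function names, the toy distances and the choice of \(β\) = 1.0 are assumptions made for this example only; distances are assumed to be strictly positive, since the inverse-power form is undefined at zero.

```python
import math

def inverse_power(d, beta):
    return d ** -beta

def exponential(d, beta):
    return math.exp(-beta * d)

def gaussian(d, beta):
    return math.exp(-(d ** 2) / beta)

def gravity_accessibility(supply, population, distance, decay):
    """Equation 1. distance[k][j] is the travel distance from demand k to site j."""
    n_sites = len(supply)
    # Denominator for each site j: decay-weighted population over all demand locations.
    demand_on_site = [
        sum(p * decay(distance[k][j]) for k, p in enumerate(population))
        for j in range(n_sites)
    ]
    return [
        sum(supply[j] * decay(distance[i][j]) / demand_on_site[j] for j in range(n_sites))
        for i in range(len(population))
    ]

# Hypothetical toy data: two demand locations, two service sites.
supply = [1.0, 1.0]
population = [100, 200]
distance = [[5, 20], [15, 10]]

accessibility = gravity_accessibility(
    supply, population, distance, lambda d: inverse_power(d, 1.0)
)
print([round(a, 4) for a in accessibility])  # approximately [0.008, 0.006]
```

A useful sanity check on this form is that the population-weighted sum of the scores equals the total supply (here 100 x 0.008 + 200 x 0.006 = 2).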

Floating Catchment Area (FCA) Methods: 2SFCA and Enhanced 3SFCA Models

Although the gravity model accounts for demand and supply, and for how travel distance between potential consumers and providers can influence accessibility, it cannot be interpreted intuitively, making it difficult to select a suitable distance decay function and decay coefficient (Wan et al. 2012; Luo et al. 2014; Bauer and Groneberg 2016). As such, floating catchment area methods were introduced as additional steps to implement the gravity-based model more accurately as a solution for measuring spatial accessibility; notably, the 2-Step Floating Catchment Area (2SFCA) methodology, introduced by Wang and Luo (2003). A catchment area is a buffer zone surrounding a given point location, defined by a distance or time threshold. A popular example is a school catchment that determines students’ attendance eligibility. A catchment could be a simple circular buffer around a given point location (calculated by Euclidean distance) or a much more complex polygon that represents actual travel time or distance based on transportation networks. Figure 1 demonstrates both kinds of catchment area, each representing, for example, a travel time of 60 minutes by car. Catchment areas can represent distance and distance decay through a uniform, piece-wise or continuous time/distance surface; moreover, if distance is calculated with transportation networks, the catchments better represent the physical and social landscape.

Figure 1: Example of catchment areas, either calculated with Euclidean distance to make a circular buffer or, to better reflect the geographic landscape, with transportation network distance to make an irregular buffer. The different colours in the irregular catchment area demonstrate sub-zones based on shared commute time thresholds, such as all areas within 5, 15, 25, and 45 minutes. This method attempts to account for distance decay with a piece-wise surface.

Essentially, the 2SFCA methodology breaks the calculation of potential spatial accessibility into two steps: 1) calculate the PPR for each service location, and 2) sum the PPRs within a given population location’s catchment area to obtain the measure of accessibility. Each step is further explained below.

Step 1: For each service location \(j\), generate its catchment area (\(D_j\)) by finding all population points that are within the travel time or distance threshold \(d_0\), that is, all population demand locations \(k\) such that \(d_{k,j} < d_0\). Then calculate the PPR (\(R_j\)) for that service site as the ratio of its supply to the sum of the populations within its catchment area.

\[ R_j = \frac{S_j}{\sum_{k∈D_j}P_k} (2) \]

Step 2: For each population demand location \(i\), generate its catchment area (\(D_i\)) using the same threshold (\(d_0\)). Then calculate the accessibility for that location (\(A_i^F\)) as the sum of PPR of all service sites within the catchment – that is all \(j\) such that \(d_{i,j} < d_0\).

\[ A_i^F = \sum_{j∈D_i}R_j (3) \]
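The two steps in equations 2 and 3 can be sketched as follows. This is an illustrative sketch under assumed inputs: the function name and the toy distance matrix (in minutes) are hypothetical, and a site whose catchment contains no population is given a ratio of zero by choice.

```python
def two_step_fca(supply, population, distance, threshold):
    """2SFCA scores. distance[k][j] is travel time from demand k to site j."""
    n_demand, n_sites = len(population), len(supply)

    # Step 1 (equation 2): provider-to-population ratio R_j per service site.
    ratios = []
    for j in range(n_sites):
        catchment_population = sum(
            population[k] for k in range(n_demand) if distance[k][j] < threshold
        )
        ratios.append(supply[j] / catchment_population if catchment_population > 0 else 0.0)

    # Step 2 (equation 3): sum the ratios of all sites within each demand catchment.
    return [
        sum(ratios[j] for j in range(n_sites) if distance[i][j] < threshold)
        for i in range(n_demand)
    ]

# Hypothetical toy data: three demand locations, two sites, 60-minute threshold.
supply = [1.0, 1.0]
population = [100, 200, 50]
distance = [[5, 70], [15, 10], [90, 40]]

scores = two_step_fca(supply, population, distance, threshold=60)
print([round(s, 5) for s in scores])
```

In this toy case the second demand location reaches both sites and therefore scores highest, while a location 59 minutes from a site is treated exactly like one 5 minutes away.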

Though this approach is relatively simple to implement with the right geographic information system in place, this model neglects to account for the distance decay within each catchment area because it considers the decay as binary. Hence, population locations have equal access within a catchment, while those outside of the catchment area are considered inaccessible (Wan et al. 2012; Luo et al. 2014; Paez et al. 2019). Although limitations exist, the 2SFCA paved the way for various modifications, enhancements or additional steps for calculating potential spatial accessibility, known as the floating catchment area (FCA) family (Bauer and Groneberg 2018). For example, the enhanced 2SFCA (E2SFCA) attempts to account for distance decay by adding sub-zones within the catchment area (Luo and Qi 2009). (Sub-zones are demonstrated in Figure 1.) In another case, to reduce demand inflation, the 3SFCA method was introduced by Wan et al. (2012). This method assumes that a population’s demand on a service site is influenced by the availability of other nearby sites. In other words, when more options are available, an individual’s demand on a single site decreases. To account for this, Wan et al. introduced the use of selection weights of a potential population demand location on a service location. Figure 2 below demonstrates how the selection weight is considered for the 3SFCA method.

Figure 2: Example scenario to illustrate the limitations of the 3SFCA model.

Each service location (A, B, C), with its supply value shown in parentheses, has its distance decay value to population location \(i\) shown beside the connecting line. The selection weight of \(i\) for service site A would be 0.3 / (0.5 + 0.3 + 0.4) = 0.25, for B it would be 0.33 and for C it would be 0.42. The adjusted demand on A would then be 0.25 x 0.3 x \(P_i\) = 0.075\(P_i\), while B would be 0.13\(P_i\) and C would be 0.21\(P_i\). Service site A has the smallest adjusted demand, yet has the largest capacity, a factor that certainly plays a role in someone’s decision-making.
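The arithmetic of this example can be reproduced with a short sketch. The decay values for B (0.4) and C (0.5) are inferred from the selection weights stated above; the variable names are illustrative only.

```python
# Distance decay values between population location i and each site.
decay_weight = {"A": 0.3, "B": 0.4, "C": 0.5}

# 3SFCA selection weight: each site's decay value as a share of the total.
total_decay = sum(decay_weight.values())
selection_weight = {site: w / total_decay for site, w in decay_weight.items()}

# Adjusted demand per unit of population P_i: selection weight x decay value.
adjusted_demand = {site: selection_weight[site] * decay_weight[site] for site in decay_weight}

for site in decay_weight:
    print(site, round(selection_weight[site], 2), round(adjusted_demand[site], 3))
```

Supply never enters this calculation, which is why site A ends up with the smallest adjusted demand despite its capacity.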

Though this modification addresses the demand inflation observed in the 2SFCA approach by incorporating selection weights, the 3SFCA has limitations, mainly that the selection weight calculations do not consider how a site’s supply can influence an individual’s decision. With this in mind, the enhanced 3SFCA introduced the integration of the Huff model (Luo 2014). It is this method that was initially tested by CDO to measure potential spatial accessibility to Service Canada and other related points of service (POS). The model is expressed as follows, for population location \(i\) visiting service location \(j\):

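\[ Prob_{i,j} = \frac{C_j d_{i,j}^{-β}}{\sum_{s∈D_i} C_s d_{i,s}^{-β}} \quad (4) \]

\(Prob_{i,j}\) = probability of population location \(i\) visiting service location \(j\)

\(C_j\) = capacity/attractiveness of service location \(j\)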

\(β\) = decay coefficient

The probability of population location \(i\) visiting service location \(j\) depends on the attractiveness/capacity of the other service locations within the catchment area of population location \(i\), with distance decay taken into account. The Huff model is usually implemented with the inverse-power distance decay function, \(d^{-β}\) (ibid.). For the \(β\) parameter, the distance decay coefficient, if there is business and local knowledge that the potential population demand is willing to commute further distances for the service, then \(β\) should be smaller. (CDO has been reviewing further literature to determine a more quantitative approach for selecting the distance decay function and coefficient for the enhanced 3SFCA method.) The capacity term \(C\) represents the attractiveness/capacity of the service location. For childcare, this could be the number of available seats or the number of employees. Since the Huff model incorporates the distance decay function, the surface is assumed to be continuous rather than stepwise; furthermore, capacity or other factors that could be considered attractive to consumers can be accounted for, providing a more realistic estimation of demand. With the probability values serving as the selection weights, and \(W\) representing the distance decay weight, equations 2 and 3 (which become steps 2 and 3 of the enhanced 3SFCA method) are modified as follows:

\[ R_j = \frac{S_j}{\sum_{k∈D_j} Prob_{k,j} P_k W_{k,j}} \quad (5) \]

\[ A_i^F = \sum_{j∈D_i} Prob_{i,j} R_j W_{i,j} \quad (6) \]
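Taken together, equations 4 to 6 can be sketched as three passes over a distance matrix. This is a minimal sketch under stated assumptions, not CDO's implementation: the function name and toy data are hypothetical, the decay weight \(W\) is taken to be the inverse-power function, pairs outside the threshold get a weight of zero, and all distances are assumed to be strictly positive.

```python
def enhanced_three_step_fca(supply, capacity, population, distance, threshold, beta):
    """Enhanced 3SFCA scores. distance[i][j] is travel time from demand i to site j."""
    n_demand, n_sites = len(population), len(supply)

    # Distance decay weight W, set to zero outside the catchment threshold.
    weight = [
        [distance[i][j] ** -beta if distance[i][j] < threshold else 0.0
         for j in range(n_sites)]
        for i in range(n_demand)
    ]

    # Step 1 (equation 4): Huff selection probability of demand i choosing site j.
    prob = []
    for i in range(n_demand):
        pull = [capacity[j] * weight[i][j] for j in range(n_sites)]
        total_pull = sum(pull)
        prob.append([p / total_pull if total_pull > 0 else 0.0 for p in pull])

    # Step 2 (equation 5): supply-to-demand ratio R_j for each site.
    ratio = []
    for j in range(n_sites):
        demand = sum(prob[k][j] * population[k] * weight[k][j] for k in range(n_demand))
        ratio.append(supply[j] / demand if demand > 0 else 0.0)

    # Step 3 (equation 6): accessibility score for each demand location.
    return [
        sum(prob[i][j] * ratio[j] * weight[i][j] for j in range(n_sites))
        for i in range(n_demand)
    ]

# Hypothetical toy data: three demand locations, two sites with equal capacity.
supply = [1.0, 1.0]
capacity = [1.0, 1.0]
population = [100, 200, 50]
distance = [[5, 70], [15, 10], [90, 40]]

scores = enhanced_three_step_fca(supply, capacity, population, distance, threshold=60, beta=1.0)
print([round(s, 5) for s in scores])
```

As with the gravity model, the population-weighted sum of the scores equals the total supply whenever every site has some demand within its catchment, which gives a quick consistency check on an implementation.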

So, referring back to Figure 2 and using the Huff model to calculate the weights, service location A would have a probability of (20 x 0.3) / (20 x 0.3 + 10 x 0.4 + 4 x 0.5) = 0.5, while B would have a probability of 0.33 and C a probability of 0.17. The adjusted demand for A would then be 0.5 x 0.3 x \(P_i\) = 0.15\(P_i\), whereas B would be 0.132\(P_i\) and C would be 0.085\(P_i\); thus site A has the highest adjusted demand and C the lowest, because capacity is now considered. This highlights how demand is more accurately represented than in the 3SFCA demonstration provided above. Evidently, through the academic evaluations and modifications of various FCA methods, testing different uses of weights and distance decay, the Huff model offers a more realistic measure for calculating spatial accessibility from population locations to POS.
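The same Figure 2 scenario can be recomputed with the Huff probabilities in a short sketch. The capacities and decay values are those quoted in the calculation above; the variable names are illustrative.

```python
# Capacity and distance decay value of each site relative to population location i.
capacity = {"A": 20, "B": 10, "C": 4}
decay_weight = {"A": 0.3, "B": 0.4, "C": 0.5}

# Huff probability: capacity x decay for one site over the same product for all sites.
pull = {site: capacity[site] * decay_weight[site] for site in capacity}
total_pull = sum(pull.values())
huff_probability = {site: p / total_pull for site, p in pull.items()}

# Adjusted demand per unit of population P_i: probability x decay value.
adjusted_demand = {site: huff_probability[site] * decay_weight[site] for site in capacity}

for site in capacity:
    print(site, round(huff_probability[site], 2), round(adjusted_demand[site], 3))
```

Carrying full precision gives roughly 0.133 for B and 0.083 for C; the values 0.132 and 0.085 in the text come from multiplying the probabilities after rounding them to two decimals. The ranking of the sites is the same either way.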

Data

Regardless of the trade-offs between the various FCA methods, all approaches require the same input data:

  1. The geographic locations of the population that represents the potential demand, as longitude and latitude points, which could be derived from centroids of geographic boundaries (e.g., Dissemination Area boundaries and centroids);
  2. The population count for each population location;
  3. The geographic locations of the services, as longitude and latitude points;
  4. At least one variable to represent the supply \(S_j\) and capacity \(C\) of each service location (this could be a uniform value);
  5. A distance matrix, which holds the distance or time calculations between all population and service locations within a distance or time threshold.
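To make these inputs concrete, the sketch below holds them in plain dictionaries and builds a thresholded distance matrix. It swaps in straight-line (haversine) distance purely for illustration; a network-based travel time, as used elsewhere in this report, would need a routing engine instead. All identifiers, coordinates, counts and the 50 km threshold are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def build_distance_matrix(demand_points, service_points, threshold_km):
    """Sparse matrix {(demand_id, site_id): km}, keeping only pairs within the threshold."""
    matrix = {}
    for demand_id, (d_lat, d_lon) in demand_points.items():
        for site_id, (s_lat, s_lon) in service_points.items():
            km = haversine_km(d_lat, d_lon, s_lat, s_lon)
            if km < threshold_km:
                matrix[(demand_id, site_id)] = km
    return matrix

# Inputs 1 and 2: demand locations (latitude, longitude) and their population counts.
demand_points = {"da_1": (45.50, -73.57), "da_2": (45.56, -73.72), "da_3": (46.81, -71.21)}
population = {"da_1": 520, "da_2": 610, "da_3": 480}

# Inputs 3 and 4: service locations with a uniform supply and capacity value.
service_points = {"pos_1": (45.51, -73.56)}
supply = {"pos_1": 1.0}
capacity = {"pos_1": 1.0}

# Input 5: distance matrix restricted to the threshold.
matrix = build_distance_matrix(demand_points, service_points, threshold_km=50.0)
for pair, km in sorted(matrix.items()):
    print(pair, round(km, 1))
```

Storing only the pairs within the threshold keeps the matrix small when the study area is large, since most demand and service pairs fall outside any one catchment.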

The remainder of this report describes implementation of this enhanced 3SFCA methodology for Service Canada, and how it can be repurposed and improved for other use cases.

Implementation: Measuring Potential Spatial Accessibility for Service Canada’s Points of Service

Every year, Service Canada’s regions receive multiple requests to relocate existing, and to locate new, Service Canada, Passport Office and Schedule Outreach sites, also known as points of service (POS). Current internal approaches for assessing accessibility to POS vary, and include analyzing the number (and location) of POS relative to the size of the surrounding population for a given geographic area. These measures can also be augmented by considering multiple types of POS as well as varying demographic information (e.g., age distribution, income level, Indigenous identity). Although very informative, these approaches do not provide a standard quantitative measure of accessibility (specifically potential spatial accessibility) that can be used to more readily (and systematically) investigate how best to locate POS to fulfill Service Canada’s mandate. As such, following a request from the Quebec region’s Business Expertise group, CDO initiated a pilot project to investigate and operationalize the creation and use of a spatial accessibility measure by means of a gravity-based approach (specifically, the enhanced 3SFCA model was tested). The resulting spatial measure aims to provide a more nuanced and better foundation to quantify POS accessibility and thus improve identification of underserved areas.

Assumptions

The enhanced 3SFCA solution was tested with 6 different decay coefficients to assess \(β\) sensitivity. The following lists some of the assumptions and data decisions that were made for this pilot, which need to be considered for any implementation of this model:

  • Research area: 2016 Montreal Census Metropolitan Area (CMAUID 462)
  • Population demand locations: 2016 Dissemination Area (DA) population-weighted mean centroids, calculated from Dissemination Block total population values
  • Service locations: 2019 Points of Services (POS), including Service Canada, Passport and Schedule Outreach sites
  • Service supply and capacity: Constant, with all POS having an attractiveness and supply of 1, as other variables to represent these factors were not available during this iteration
  • Decay function: inverse-power, testing with \(β\) = 0.5, 0.8, 1.0, 1.3, 1.5, 2.0
  • Travel radius threshold: 60 minutes (by car on road)

Six distance decay coefficients (\(β\) = 0.5, 0.8, 1.0, 1.3, 1.5 and 2.0) were tested because the choice of these values in the literature often appears to be arbitrary. Ideally, the coefficient value would be informed by actual user preference data (e.g., survey data on people’s willingness to travel); however, appropriate data could not be found, so several coefficient values were tested. Similar to the procedures suggested by Kwan (1998), coefficient values were evaluated with regard to how they affect the behaviour of the resulting accessibility scores. Increasing the coefficient value led to a steeper decay curve, causing accessibility to drop off at a travel time of 10-20 minutes from a POS. Given the initial assumption of 60-minute catchment areas, the larger coefficient values were deemed inappropriate for modelling decay in this exercise. Smaller coefficients were tested to allow for longer tails in the distribution of accessibility scores and therefore more Dissemination Area coverage. Testing was stopped once the distribution began to flatten and accessibility scores started to become uniform. This method of coefficient testing and selection was used to maximize coverage while still maintaining an informative amount of variance in accessibility scores; for example, to reduce the chances of nearby population locations (e.g., 10-20 minutes away from a POS) having accessibility scores of 0.
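The effect of the coefficient on the steepness of the decay curve can be seen by tabulating the inverse-power weight for each tested \(β\). In this sketch the weights are expressed relative to a 5-minute trip; that reference point, the chosen travel times and the function name are assumptions made only to keep the numbers comparable.

```python
betas = [0.5, 0.8, 1.0, 1.3, 1.5, 2.0]
travel_minutes = [5, 10, 20, 60]

def relative_weight(minutes, beta, reference_minutes=5):
    """Inverse-power decay weight relative to the weight of a reference trip."""
    return (minutes / reference_minutes) ** -beta

print("beta", *[f"{m:>7d}min" for m in travel_minutes])
for beta in betas:
    row = [relative_weight(m, beta) for m in travel_minutes]
    print(f"{beta:<4}", *[f"{w:10.4f}" for w in row])
```

With \(β\) = 0.5 a 20-minute trip keeps half the weight of a 5-minute trip, whereas with \(β\) = 2.0 it keeps about 6%, consistent with the steep drop-off at 10-20 minutes described above.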

To learn more about the data sources and limitations, please refer to the original report for this case study.

Results

The following link downloads an HTML file that can be opened in your internet browser (ideally Firefox) to visualize the results of the 3SFCA model as an interactive map for each coefficient: 3SFCA Model Map Results for Montreal. If the interactive map is not accessible, the two images below visualize the results for decay coefficients \(β\) = 0.5 and 1.3.

Testing different coefficients demonstrated that the model behaves as expected. For a given distance, higher decay rates correspond to people having a lower willingness to travel to access a POS. This can be seen in the interactive map: as the coefficient value increases, the distribution of accessibility scores becomes visually more extreme. This corresponds with the descriptive statistics provided in the table below, which show that as the coefficient increases, the global Moran’s I decreases, indicating more geographic randomness, and the standard deviation increases, indicating more variance.

Decay coefficient (β)   Min       Max       Mean      Standard deviation   Moran’s I
0.5                     0.00252   0.06828   0.00537   0.00231              0.66406
0.8                     0.00134   0.36746   0.00546   0.00714              0.38471
1.0                     0.00077   0.69681   0.00553   0.01241              0.28729
1.3                     0.00027   1.19459   0.00561   0.02104              0.21216
1.5                     0.00013   1.44072   0.00565   0.0264               0.18729
2.0                     0.00001   1.72171   0.00572   0.03861              0.15634
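For readers unfamiliar with the statistic, global Moran’s I can be computed from a list of scores and a spatial weights matrix as sketched below. The report does not state how its weights matrix was built, so the binary adjacency weights, the four-area layout and the values here are assumptions for illustration only.

```python
def morans_i(values, weights):
    """Global Moran's I for values[i] with spatial weights weights[i][j]."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Cross-product of deviations for every pair of areas, scaled by their weight.
    cross = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    return (n / total_weight) * cross / sum(d * d for d in deviations)

# Hypothetical layout: four areas in a row, each adjacent to its neighbours.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = morans_i([1, 1, 5, 5], adjacency)    # similar values sit side by side
alternating = morans_i([1, 5, 1, 5], adjacency)  # neighbours always differ
print(round(clustered, 3), round(alternating, 3))
```

Positive values indicate that similar scores cluster together and values near zero indicate spatial randomness, which is how the decreasing Moran’s I in the table is read.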
Figure 3: Results after using a decay coefficient (β) of 0.5. Darker red indicates that the Dissemination Areas, which are the population demand location areas, are more accessible to the Montreal POS. Lighter coloured Dissemination Areas indicate underserviced areas, as they represent population location areas with low accessibility and availability.

Figure 4: Results after using a decay coefficient (β) of 1.3. In this case, the results show more Dissemination Areas with lower accessibility scores because β is higher, which assumes that the population is less willing to commute longer distances to a POS.

Regardless of the variations in accessibility scores due to different decay coefficients, some patterns persist across all outputs. The northern Laval, Mirabel and Mont-Saint-Hilaire regions have lower spatial accessibility scores relative to other regions. Thus, situating new POS within these regions could be recommended. That said, since the geographic scope for this initial test was only the CMA of Montreal, the edge effect may be responsible for the low scores around Mont-Saint-Hilaire and Mirabel. If the scope were scaled to all of Quebec, the results could look different. In addition, there are also areas of decreased accessibility on the island of Montreal. The areas of La Salle and the southeast portion of Riviere-des-Prairies-Point-aux-Trembles have both low accessibility scores and statistically significant clustering of these low scores. This is the case even with a high number of POS on the island. It could be advisable to move a current POS closer to these areas to increase accessibility.

Reflection: Limitations

With a modifiable 3SFCA model available for use and scale, we can start investigating further literature to consider different factors that impact spatial accessibility, such as demographics, modes of transportation, time, and regional differences (e.g., urban versus rural), and to improve how a distance decay function and its coefficients are selected.

It is undecided how best to include demographic data. For instance, multiple accessibility scores could be produced individually, one for each group of interest, based on the count values per geographic unit. Alternatively, based on the literature, a weighted population metric could be used to summarize accessibility over the entire population. This approach essentially biases the model towards specific subpopulations, so policy should influence what gets weighted. Since CDO has already operationalized the model, CDO’s priority is to allow users of this model to calculate accessibility scores for multiple demographic population counts.

Aside from demographics, different modes of transportation could also be considered, particularly within urban areas where public infrastructure for biking, walking and public transit exists. Literature on this topic has not been reviewed yet, but CDO is investigating and testing the inclusion of public transportation, which would allow scores to be calculated for each mode of transportation. The time or distance threshold would have to change for each mode, because how long an average person is willing to commute will vary by mode of transportation as well as by whether they are within an urban or rural area.

Conclusion: Next Steps

The case study demonstrated the ability to leverage different geographic information and create a scalable and generalizable approach to better model potential spatial accessibility to Service Canada POS. Though there are limitations noted, CDO has created a mechanism for calculating spatial indices that can provide a better and more nuanced foundation to quantify accessibility for a service and thus improve identification of underserved areas. Through implementations of the enhanced 3SFCA method, the case study demonstrated the ability to identify underserved areas, specifically La Salle and the southeast portion of Riviere-des-Prairies-Point-aux-Trembles, as well as the Mirabel, Mont-Saint-Hilaire and northern Laval regions. Since this initial phase, CDO has been in communication with the Quebec, Ontario and W/T regions to compare the enhanced 3SFCA methodology with their current approaches to identifying underserviced locations, with the eventual goal of including and standardizing this process amongst the regions.

As CDO sees the potential of this model to be repurposed for other use cases, such as Early Learning Child Care (ELCC), we have been implementing the model with geographic scale and data flexibility in mind. A web app, the Potential Accessibility Software Service (PASS), is currently being developed internally by CDO to operationalize the enhanced 3SFCA model. PASS can be set up for any use case to calculate and produce a potential spatial accessibility index for a given service. Figure 5 below provides a visual of what the prototype will look like for its users.

Figure 5: Image to demonstrate PASS.

Work Cited